STA4173 Lecture 10, Summer 2023
Before today, we have focused on continuous outcomes.
Now we will focus on categorical (or qualitative) outcomes.
Today, we will review how to test one- and two-sample proportions.
We will estimate a proportion using \hat{p} = \frac{x}{n}.
We will estimate the difference between two proportions using \hat{p}_1 - \hat{p}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2}.
(1-\alpha)100% CI for a population proportion, p:
\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
To construct this interval, we require both:
We will use either the binom.test() function or the prop.test() function.
If we have n \le 30,
Humira is a medication used to treat rheumatoid arthritis (RA).
In clinical trials of Humira, 705 subjects diagnosed with RA were administered 40 mg of Humira every other week.
Of the 705 subjects, 66 reported nausea as a side effect.
It is known that the proportion of RA subjects in similar studies receiving a placebo who report nausea as a side effect is 0.08.
Does the sample provide significant evidence that a higher proportion of subjects receiving Humira experience nausea as a side effect than subjects taking a placebo?
Use the \alpha = 0.05 level of significance.
What are the important pieces?
What is the point estimate, \hat{p}?
What is the 95% confidence interval for p?
Is a higher proportion of subjects taking Humira experiencing nausea as a side effect than those taking a placebo?
1-sample proportions test without continuity correction
data: 66 out of 705, null probability 0.5
X-squared = 465.71, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.07426251 0.11737620
sample estimates:
p
0.09361702
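The output above is consistent with a call that leaves the null probability at its default of 0.5, so only the point estimate and the interval are of interest here. The sketch below assumes that call (the variable names are illustrative) and also evaluates the interval formula from earlier by hand. For a single proportion, prop.test() reports a Wilson score interval, so its limits differ slightly from the formula-based interval.

```r
x <- 66    # subjects reporting nausea
n <- 705   # subjects receiving Humira

# Assumed call: null probability left at its default (0.5), no continuity correction
fit <- prop.test(x = x, n = n, correct = FALSE)
fit$estimate   # point estimate, x / n
fit$conf.int   # Wilson score interval reported by prop.test()

# Interval from the formula p-hat +/- z * sqrt(p-hat * (1 - p-hat) / n)
p_hat <- x / n
z <- qnorm(1 - 0.05 / 2)   # z_{alpha/2} for a 95% interval
wald_ci <- p_hat + c(-1, 1) * z * sqrt(p_hat * (1 - p_hat) / n)
wald_ci
```

With n = 705 the two intervals agree to about two decimal places; the gap tends to matter more for small samples or proportions near 0 or 1.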
1-sample proportions test without continuity correction
data: 66 out of 705, null probability 0.08
X-squared = 1.7761, df = 1, p-value = 0.09131
alternative hypothesis: true p is greater than 0.08
95 percent confidence interval:
0.07709288 1.00000000
sample estimates:
p
0.09361702
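A call along the following lines should reproduce the one-sided output above. This is a sketch: the argument values are read off the printed output, and the variable names are illustrative. It also checks that the reported X-squared is the square of the usual one-sample z statistic.

```r
x <- 66
n <- 705
p0 <- 0.08   # placebo proportion under H_0

# One-sided test of H_0: p = 0.08 against H_1: p > 0.08, no continuity correction
fit <- prop.test(x = x, n = n, p = p0, alternative = "greater", correct = FALSE)
fit$statistic   # X-squared
fit$p.value

# Equivalent z statistic: z^2 equals X-squared, and the upper-tail
# normal probability equals the one-sided p-value
p_hat <- x / n
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_upper <- pnorm(z, lower.tail = FALSE)
```

Because the alternative is one-sided, the interval printed by prop.test() is a one-sided bound, which is why its upper limit is 1.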
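Hypotheses:
H_0: p = 0.08 versus H_1: p > 0.08
Test Statistic and p-Value
\chi^2_0 = 1.776, p = 0.091
Rejection Region
Reject H_0 if p < \alpha; \alpha = 0.05.
Conclusion / Interpretation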
Fail to reject H_0.
There is not sufficient evidence to suggest that the proportion of subjects taking Humira who experience nausea is greater than 0.08.
Which do you think is easier to raise – a boy or a girl?
When asked this question in 1947, 24% of all Americans said raising a girl was easier.
In June 2018, the Gallup Organization surveyed 1500 adult Americans, of which 408 felt it was easier to raise a girl.
Does this result suggest the proportion of adult Americans who believe it is easier to raise a girl has changed since 1947?
Test at the \alpha=0.10 level.
What are the important pieces?
1-sample proportions test without continuity correction
data: 408 out of 1500, null probability 0.24
X-squared = 8.4211, df = 1, p-value = 0.003709
alternative hypothesis: true p is not equal to 0.24
95 percent confidence interval:
0.2500845 0.2950804
sample estimates:
p
0.272
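The output above can be reproduced with a two-sided call like the one sketched below. The printed interval uses the default 95% level; setting conf.level = 0.90 is an assumption made here so that the interval matches the \alpha = 0.10 test.

```r
# Two-sided test of H_0: p = 0.24, no continuity correction.
# conf.level = 0.90 is an assumption chosen to match alpha = 0.10.
fit <- prop.test(x = 408, n = 1500, p = 0.24,
                 alternative = "two.sided", conf.level = 0.90, correct = FALSE)
fit$statistic
fit$p.value
fit$conf.int         # 90% interval, narrower than the 95% interval printed above
fit$p.value < 0.10   # TRUE means reject H_0 at alpha = 0.10
```

Changing conf.level affects only the interval; the test statistic and p-value are the same as in the output above.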
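Hypotheses
H_0: p = 0.24 versus H_1: p \ne 0.24
Test Statistic and p-Value
\chi^2_0 = 8.421, p = 0.004
Rejection Region
Reject H_0 if p < \alpha; \alpha = 0.10.
Conclusion / Interpretation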
Reject H_0.
There is sufficient evidence to suggest that the proportion of adult Americans who believe that it is easier to raise a girl has changed since 1947.
(1-\alpha)100% CI for p_1 - p_2:
(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
where
the point estimate, \hat{p}_1 - \hat{p}_2, is computed as \hat{p}_1 - \hat{p}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2},
x_i is the number of individuals in group i with a specified characteristic, and
n_i is the sample size for group i.
To construct this interval, we require:
We will use the prop.test() function.
In clinical trials of Nasonex, 3774 adult and adolescent allergy patients (patients 12 years and older) were randomly divided into two groups.
The patients in group 1 (experimental group) received 200 \mu g of Nasonex.
The patients in group 2 (control group) received a placebo.
Is there evidence to conclude that the proportion of Nasonex users who experienced headaches as a side effect is greater than the proportion in the control group?
Test at the \alpha= 0.05 level of significance.
What are the important pieces?
Of the 2103 patients in the experimental group, 547 reported headaches as a side effect.
Of the 1671 patients in the control group, 368 reported headaches as a side effect.
2-sample test for equality of proportions without continuity correction
data: c(547, 368) out of c(2103, 1671)
X-squared = 8.0618, df = 1, p-value = 0.004521
alternative hypothesis: two.sided
95 percent confidence interval:
0.01255827 0.06719613
sample estimates:
prop 1 prop 2
0.2601046 0.2202274
Thus, \hat{p}_{\text{Exp}} = 0.260, \hat{p}_{\text{Ctrl}} = 0.220, and \hat{p}_{\text{Exp}} - \hat{p}_{\text{Ctrl}} = 0.040.
The 95% CI for p_{\text{Exp}} - p_{\text{Ctrl}} is (0.013, 0.067).
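The interval above can be checked against the two-sample CI formula. The sketch below evaluates that formula directly from the counts; the variable names are illustrative. Without a continuity correction, the two-sample interval from prop.test() is this same unpooled interval.

```r
x <- c(547, 368)     # headaches: experimental, control
n <- c(2103, 1671)   # group sizes

p_hat <- x / n
diff_hat <- p_hat[1] - p_hat[2]   # experimental minus control

se <- sqrt(sum(p_hat * (1 - p_hat) / n))   # unpooled standard error
z <- qnorm(0.975)                          # z_{alpha/2} for a 95% interval
ci <- diff_hat + c(-1, 1) * z * se
ci
```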
2-sample test for equality of proportions without continuity correction
data: c(547, 368) out of c(2103, 1671)
X-squared = 8.0618, df = 1, p-value = 0.00226
alternative hypothesis: greater
95 percent confidence interval:
0.01695043 1.00000000
sample estimates:
prop 1 prop 2
0.2601046 0.2202274
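A one-sided call like the one below should reproduce the output above. This is a sketch with the arguments inferred from the printed output. The order of the groups matters: prop.test() works with prop 1 - prop 2, so the experimental group is listed first to test whether its proportion is greater.

```r
x <- c(547, 368)     # experimental group first, then control
n <- c(2103, 1671)

# H_1: p_Exp > p_Ctrl, no continuity correction
fit <- prop.test(x = x, n = n, alternative = "greater", correct = FALSE)
fit$statistic
fit$p.value

# Pooled z statistic; its square equals the X-squared reported by prop.test()
p_pool <- sum(x) / sum(n)
z <- (x[1] / n[1] - x[2] / n[2]) / sqrt(p_pool * (1 - p_pool) * sum(1 / n))
```

The statistic is unchanged from the two-sided output; only the p-value (halved here, since the observed difference is in the direction of H_1) and the interval change.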
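Hypotheses
H_0: p_{\text{Exp}} = p_{\text{Ctrl}} versus H_1: p_{\text{Exp}} > p_{\text{Ctrl}}
Test Statistic and p-value
\chi^2_0 = 8.062, p = 0.002
Rejection Region
Reject H_0 if p < \alpha; \alpha = 0.05.
Conclusion / Interpretation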
Reject H_0.
There is sufficient evidence to suggest that the proportion of Nasonex users who experienced headaches as a side effect is greater than that of the control group.
A professor from the Department of Art wants to determine citizen support of spending federal tax money on the arts.
Of a random sample of 220 women, 59 responded yes.
Another random sample of 175 men showed that 56 responded yes.
Does this information indicate a difference between the population proportion of women and the population proportion of men who favor spending more federal tax dollars on the arts?
Test at the \alpha=0.01 level.
What are the important pieces?
2-sample test for equality of proportions without continuity correction
data: c(59, 56) out of c(220, 175)
X-squared = 1.2681, df = 1, p-value = 0.2601
alternative hypothesis: two.sided
95 percent confidence interval:
-0.14239144 0.03875508
sample estimates:
prop 1 prop 2
0.2681818 0.3200000
Thus, \hat{p}_{\text{W}} = 0.268, \hat{p}_{\text{M}} = 0.320, and \hat{p}_{\text{W}} - \hat{p}_{\text{M}} = -0.052.
The 95% CI for p_{\text{W}} - p_{\text{M}} is (-0.142, 0.039).
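The output above corresponds to a two-sided call with women listed first. The sketch below repeats it with conf.level = 0.99, an assumption made here so that the interval lines up with the \alpha = 0.01 test; the printed output uses the default 95% level.

```r
x <- c(59, 56)     # "yes" responses: women, men
n <- c(220, 175)   # sample sizes: women, men

fit <- prop.test(x = x, n = n, alternative = "two.sided",
                 conf.level = 0.99, correct = FALSE)
fit$statistic
fit$p.value
fit$conf.int   # 99% interval for p_W - p_M
```

An interval for p_W - p_M that contains 0 agrees with a two-sided test that fails to reject H_0 at the matching level.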
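Hypotheses
H_0: p_{\text{W}} = p_{\text{M}} versus H_1: p_{\text{W}} \ne p_{\text{M}}
Test Statistic and p-value
\chi^2_0 = 1.268, p = 0.260
Rejection Region
Reject H_0 if p < \alpha; \alpha = 0.01.
Conclusion / Interpretation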
Fail to reject H_0.
There is not sufficient evidence to suggest that there is a difference in the proportion of men and women who favor spending more federal tax dollars on the arts.
The goodness-of-fit test allows us to determine if a frequency distribution follows a specific distribution.
This could be a named distribution (e.g., normal)
It could also be a distribution without a name (e.g., the probabilities are specified)
Before we can perform the goodness-of-fit test, we must compute expected counts:
E_i = n p_i, where n is the total sample size and p_i is the hypothesized probability for category i.
Hypotheses
Test Statistic
p-Value
Rejection Region
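For the goodness-of-fit test, the four pieces listed above usually take the standard form below. This is the textbook version, written with O_i for the observed counts, E_i = n p_i for the expected counts, and k for the number of categories (k is a symbol introduced here).

```latex
\begin{aligned}
&H_0: \text{the data follow the specified distribution} \\
&H_1: \text{the data do not follow the specified distribution} \\
&\chi^2_0 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = n p_i \\
&p = P\left[\chi^2_{k-1} \ge \chi^2_0\right] \\
&\text{Reject } H_0 \text{ if } p < \alpha
\end{aligned}
```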
We will use the chisq.test() function and plug in both the counts and the expected probabilities.
Using the economy data below (based on the 2017 Current Population Survey, adjusted for inflation), determine if there is evidence to suggest that the distribution of income has changed since 2000.
Test at the \alpha = 0.05 level of significance.
| Income | Observed | Probability |
|---|---|---|
| Under $15,000 | 161 | 0.099 |
| $15,000 - $24,999 | 144 | 0.098 |
| $25,000 - $34,999 | 138 | 0.093 |
| $35,000 - $49,999 | 184 | 0.135 |
| $50,000 - $74,999 | 247 | 0.179 |
| $75,000 - $99,999 | 188 | 0.131 |
| $100,000 - $149,999 | 217 | 0.149 |
| $150,000 - $199,999 | 105 | 0.061 |
| Over $200,000 | 116 | 0.055 |
counts <- c(161, 144, 138, 184, 247, 188, 217, 105, 116) # create O_i vector
probs <- c(0.099, 0.098, 0.093, 0.135, 0.179, 0.131, 0.149, 0.061, 0.055) # create p_i vector
chisq.test(counts, p = probs)
Chi-squared test for given probabilities
data: counts
X-squared = 20.693, df = 8, p-value = 0.00801
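To see where the statistic above comes from, the sketch below computes the expected counts E_i = n p_i and the \chi^2 statistic by hand. The names expected, chi_sq, and p_value are illustrative.

```r
counts <- c(161, 144, 138, 184, 247, 188, 217, 105, 116)                     # O_i
probs  <- c(0.099, 0.098, 0.093, 0.135, 0.179, 0.131, 0.149, 0.061, 0.055)   # p_i

n <- sum(counts)          # total sample size
expected <- n * probs     # E_i = n p_i

chi_sq <- sum((counts - expected)^2 / expected)   # sum of (O_i - E_i)^2 / E_i
df <- length(counts) - 1                          # k - 1 degrees of freedom
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

round((counts - expected)^2 / expected, 3)   # contribution of each income bracket
```

Printing the individual contributions shows which income brackets drive the result, which the overall statistic alone does not reveal.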
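Hypotheses
H_0: the 2017 income distribution follows the 2000 distribution versus H_1: the 2017 income distribution does not follow the 2000 distribution
Test Statistic and p-Value
\chi^2_0 = 20.693, p = 0.008
Rejection Region
Reject H_0 if p < \alpha; \alpha = 0.05.
Conclusion/Interpretation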
Reject H_0.
There is sufficient evidence to suggest that the distribution of income in 2017 does not follow the same distribution as in 2000.
An obstetrician wants to know whether the proportion of children born on each day of the week is the same.
She randomly selects 500 birth records and obtains the data shown in the table below (based on data obtained from Vital Statistics of the United States, 2016).
Is there reason to believe that the day on which a child is born does not occur with equal frequency at the \alpha = 0.01 level of significance?
| Sun | Mon | Tues | Weds | Thurs | Fri | Sat |
|---|---|---|---|---|---|---|
| 46 | 76 | 83 | 81 | 81 | 80 | 53 |
Hypotheses
Test Statistic and p-Value
Rejection Region
Conclusion/Interpretation
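One way the day-of-birth example could be run is sketched below. When no p argument is supplied, chisq.test() tests equal probabilities across the categories, which is the null hypothesis here. The vector name births is illustrative.

```r
births <- c(Sun = 46, Mon = 76, Tues = 83, Weds = 81, Thurs = 81, Fri = 80, Sat = 53)

# With no p argument, chisq.test() assumes equal probabilities (1/7 per day)
fit <- chisq.test(births)
fit$expected         # 500 / 7 births expected on each day
fit$statistic
fit$parameter        # degrees of freedom, 7 - 1 = 6
fit$p.value
fit$p.value < 0.01   # compare with alpha = 0.01
```

Run as written, the statistic should come out at 19.568 on 6 degrees of freedom, with a p-value below 0.01.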
Let us now discuss testing two categorical variables to determine if a relationship exists.
Take, for example, this data:
We will use the \chi^2 test for independence to determine if happiness depends on marital status.
Hypotheses
Test Statistic
p-Value
Rejection Region
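For the test for independence, the pieces above usually take the standard form below. The notation is introduced here: O_{ij} and E_{ij} are the observed and expected counts in row i and column j, R_i and C_j are the row and column totals, n is the grand total, and r and c are the numbers of rows and columns.

```latex
\begin{aligned}
&H_0: \text{the two variables are independent} \\
&H_1: \text{the two variables are dependent} \\
&E_{ij} = \frac{R_i \, C_j}{n}, \qquad
 \chi^2_0 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \\
&p = P\left[\chi^2_{(r-1)(c-1)} \ge \chi^2_0\right] \\
&\text{Reject } H_0 \text{ if } p < \alpha
\end{aligned}
```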
If given a table of counts, we can enter it using matrix() (see example) and use the chisq.test() function.
If given raw data, we can use the CrossTable() function in the gmodels package.
observed_table <- matrix(c(600,  63, 112, 144,
                           720, 142, 355, 459,
                            93,  51, 119, 127),
                         nrow = 3, ncol = 4, byrow = TRUE)
# I prefer to include breaks to make it look like the table given just for checking purposes
# make sure you edit the number of rows (nrow) and columns (ncol)!
rownames(observed_table) <- c("Very Happy", "Pretty Happy", "Not Too Happy") # name rows
colnames(observed_table) <- c("Married", "Widowed", "Divorced/Separated", "Never Married") # name cols
observed_table # print table to make sure it is what we want
              Married Widowed Divorced/Separated Never Married
Very Happy        600      63                112           144
Pretty Happy      720     142                355           459
Not Too Happy      93      51                119           127
chisq.test(observed_table) # run the chi-square test for independence
Pearson's Chi-squared test
data: observed_table
X-squared = 224.12, df = 6, p-value < 2.2e-16
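The statistic above can be rebuilt from the expected counts. The sketch below computes each expected count as (row total × column total) / grand total and then sums the cell contributions; the names expected, chi_sq, and p_value are illustrative.

```r
observed_table <- matrix(c(600,  63, 112, 144,
                           720, 142, 355, 459,
                            93,  51, 119, 127),
                         nrow = 3, ncol = 4, byrow = TRUE)

# Expected count for each cell: (row total * column total) / grand total
expected <- outer(rowSums(observed_table), colSums(observed_table)) / sum(observed_table)

chi_sq <- sum((observed_table - expected)^2 / expected)
df <- (nrow(observed_table) - 1) * (ncol(observed_table) - 1)   # (r - 1)(c - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

round(expected, 1)   # compare with the observed counts cell by cell
```

Comparing observed with expected counts shows where the dependence comes from; for example, far more married respondents report being very happy (600) than independence would predict (about 435).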
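Hypotheses
H_0: happiness does not depend on marital status versus H_1: happiness depends on marital status
Test Statistic and p-Value
\chi^2_0 = 224.12, p < 0.001
Rejection Region
Reject H_0 if p < \alpha.
Conclusion/Interpretation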
Reject H_0.
There is sufficient evidence to suggest that happiness depends on marital status.
To see how the CrossTable() function works, let’s explore the Palmer penguin dataset.
library(gmodels)
penguins <- palmerpenguins::penguins
CrossTable(penguins$species, penguins$sex,
           prop.chisq = FALSE, # turn off proportion contributed to chi-square statistic
           prop.t = FALSE, # turn off total proportions
           chisq = TRUE) # request chi-square test
Cell Contents
|-------------------------|
| N |
| N / Row Total |
| N / Col Total |
|-------------------------|
Total Observations in Table: 333
| penguins$sex
penguins$species | female | male | Row Total |
-----------------|-----------|-----------|-----------|
Adelie | 73 | 73 | 146 |
| 0.500 | 0.500 | 0.438 |
| 0.442 | 0.435 | |
-----------------|-----------|-----------|-----------|
Chinstrap | 34 | 34 | 68 |
| 0.500 | 0.500 | 0.204 |
| 0.206 | 0.202 | |
-----------------|-----------|-----------|-----------|
Gentoo | 58 | 61 | 119 |
| 0.487 | 0.513 | 0.357 |
| 0.352 | 0.363 | |
-----------------|-----------|-----------|-----------|
Column Total | 165 | 168 | 333 |
| 0.495 | 0.505 | |
-----------------|-----------|-----------|-----------|
Statistics for All Table Factors
Pearson's Chi-squared test
------------------------------------------------------------
Chi^2 = 0.04860717 d.f. = 2 p = 0.9759894
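The same test can be run in base R from the counts in the table above, without the gmodels package. The sketch below enters those counts by hand; the object name sex_by_species is illustrative.

```r
# Counts copied from the CrossTable() output above
sex_by_species <- matrix(c(73, 73,
                           34, 34,
                           58, 61),
                         nrow = 3, ncol = 2, byrow = TRUE,
                         dimnames = list(species = c("Adelie", "Chinstrap", "Gentoo"),
                                         sex = c("female", "male")))

fit <- chisq.test(sex_by_species)
fit$statistic
fit$parameter   # degrees of freedom, (3 - 1)(2 - 1) = 2
fit$p.value
```

With the raw data available, table(penguins$species, penguins$sex) should produce the same counts, since table() drops missing values by default.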
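Hypotheses
H_0: biological sex does not depend on species versus H_1: biological sex depends on species
Test Statistic and p-Value
\chi^2_0 = 0.049, p = 0.976
Rejection Region
Reject H_0 if p < \alpha.
Conclusion/Interpretation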
Fail to reject H_0.
There is not sufficient evidence to suggest that the biological sex of penguins depends on species of penguin.
We previously discussed comparing two proportions that are independent.
Let us now consider two dependent proportions.
Example
We want to determine whether there is a difference between two ointments meant to treat poison ivy.
Suppose we apply ointment A on one arm and ointment B on the other arm of each individual and record if the poison ivy cleared up.
The two sets of outcomes are not independent because both ointments are applied to the same individual.
We will apply McNemar’s test for two dependent proportions.
This requires a table as below:
The data for the poison ivy example is as follows:
Hypotheses
Test Statistic
p-Value
Rejection Region
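For McNemar's test, the pieces above usually take the standard form below, without a continuity correction. The notation is introduced here: p_1 and p_2 are the two dependent proportions, and b and c are the two discordant counts in the 2 × 2 table, that is, the pairs where one treatment succeeded and the other did not.

```latex
\begin{aligned}
&H_0: p_1 = p_2 \quad \text{versus} \quad H_1: p_1 \ne p_2 \\
&\chi^2_0 = \frac{(b - c)^2}{b + c} \\
&p = P\left[\chi^2_{1} \ge \chi^2_0\right] \\
&\text{Reject } H_0 \text{ if } p < \alpha
\end{aligned}
```

Only the discordant pairs enter the statistic; pairs where both treatments gave the same result carry no information about which one is better.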
We will use the mcnemar.test() function.
observed_table <- matrix(c(293,  43,
                            31, 103),
                         nrow = 2, ncol = 2, byrow = TRUE)
# I prefer to include breaks to make it look like the table given just for checking purposes
rownames(observed_table) <- c("B Healed", "B Did Not Heal") # name rows
colnames(observed_table) <- c("A Healed", "A Did Not Heal") # name cols
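observed_table # print table to make sure it is what we want
               A Healed A Did Not Heal
B Healed            293             43
B Did Not Heal       31            103
mcnemar.test(observed_table, correct = FALSE) # correct = FALSE matches the statistic printed below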
McNemar's Chi-squared test
data: observed_table
McNemar's chi-squared = 1.9459, df = 1, p-value = 0.163
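The statistic above depends only on the two discordant cells of the table. The sketch below computes it by hand, along with the two healing proportions being compared; the variable names are illustrative.

```r
n_b_only <- 43   # B healed, A did not heal
n_a_only <- 31   # A healed, B did not heal

# McNemar statistic without continuity correction: (b - c)^2 / (b + c)
chi_sq <- (n_b_only - n_a_only)^2 / (n_b_only + n_a_only)
p_value <- pchisq(chi_sq, df = 1, lower.tail = FALSE)

# Marginal healing proportions being compared
n_total <- 293 + 43 + 31 + 103
p_a <- (293 + 31) / n_total   # healed with ointment A
p_b <- (293 + 43) / n_total   # healed with ointment B
c(p_a = p_a, p_b = p_b)
```

The hand-computed value matches the printed statistic only without the continuity correction; the default correct = TRUE in mcnemar.test() would give a smaller statistic.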
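Hypotheses
H_0: p_A = p_B versus H_1: p_A \ne p_B, where p_A and p_B are the proportions healed with ointments A and B
Test Statistic and p-Value
\chi^2_0 = 1.946, p = 0.163
Rejection Region
Reject H_0 if p < \alpha.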
Conclusion/Interpretation
Fail to reject H_0.
There is not sufficient evidence to suggest that there is a difference between the two ointments.
A recent General Social Survey asked the following two questions of a random sample of 1492 adult Americans under the hypothetical scenario that the government suspected that a terrorist act was about to happen:
Do you believe that the authorities should have the right to tap people’s telephone conversations?
Do you believe that the authorities should have the right to stop and search people on the street at random?
The results are as follows,
Does the proportion of people who agree differ significantly between the two scenarios? Use the \alpha = 0.05 level of significance.
observed_table <- matrix(c(494, 335,
                           126, 537),
                         nrow = 2, ncol = 2, byrow = TRUE)
rownames(observed_table) <- c("Tap - Agree", "Tap - Disagree") # name rows
colnames(observed_table) <- c("Stop - Agree", "Stop - Disagree") # name cols
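observed_table # print table to make sure it is what we want
               Stop - Agree Stop - Disagree
Tap - Agree             494             335
Tap - Disagree          126             537
mcnemar.test(observed_table, correct = FALSE) # correct = FALSE matches the statistic printed below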
McNemar's Chi-squared test
data: observed_table
McNemar's chi-squared = 94.753, df = 1, p-value < 2.2e-16
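Hypotheses
H_0: p_{\text{Tap}} = p_{\text{Stop}} versus H_1: p_{\text{Tap}} \ne p_{\text{Stop}}, where each p is the proportion who agree with that scenario
Test Statistic and p-Value
\chi^2_0 = 94.753, p < 0.001
Rejection Region
Reject H_0 if p < \alpha; \alpha = 0.05.
Conclusion/Interpretation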
Reject H_0.
There is sufficient evidence to suggest that there is a difference in responses between the two questions.